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We present a simple mass model for the lensing galaxy in the gravitationally 
lensed quasar 0957+561. We represent the galaxy as a softened power-law sphere 
(SPLS), a generalization of the singular isothermal sphere with three parameters 
■ — po, the central density, 9 C , the angular core radius, and r), the radial index 

which is defined such that mass increases as r v at large radius. As in previous 
studies we approximate the galaxy cluster surrounding the lensing galaxy by 
means of a quadratic potential described by its convergence k and shear 7. A 
feature of the model is that it does not require a large central compact mass. 
We fit the model to a recent high resolution VLBI map of the two images of 



in 
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0957+561. The data provide a number of independent constraints and the model- 
fit has six degrees of freedom, which is a significant improvement over previous 
models. Although the reduced x °f the best-fit model is only 4.3, nevertheless 
we obtain a tight constraint on the radial index, 1.07 < 77 < 1.18, at the 95% 
+3 . confidence level. Thus, the galaxy has mass increasing slightly more rapidly than 

isothermal (r? = 1) out to at least 15/t kpc. Since the light from the galaxy 
follows a de Vaucouleurs profile, we deduce that the mass-to-light ratio of the 
galaxy increases rapidly with increasing radius. We also obtain an upper limit 
on the core radius, namely 8 C < O'.'ll or linear core radius < 330/i -1 pc. 

We use the model to calculate the Hubble constant H$ as a function of the 
time delay Atba between the two images. We obtain 

H Q = (60.5l£f) (1 - k) (Ar BA /1.5yr) _1 kms -1 Mpc -1 , or 

'82.5i|^) (1 - k) (Ar^/l.lyr)" 1 kms^Mpc -1 . 



Once At B a is measured, this will provide an upper bound on H since k cannot 
be negative. In addition, the model degeneracy due to k can be eliminated if the 
one-dimensional velocity dispersion a of the lensing galaxy is measured. In this 
case we find that 

H = (60.512;}) (<r/322kms~ 1 ) 2 (ATBA/l-5yr) -1 kms^Mpc" 1 , or 
= (82.5lg^)(«7/322kmB~ 1 ) 2 (Ar B A/l-lyr)" 1 kms^Mpc -1 . 



-2- 



We find that these results are virtually unchanged when we investigate the effects 
of ellipticity in the lensing galaxy and dumpiness in the lensing cluster. 



1. Introduction 

In two seminal papers, Refsdal (1964, 1966) showed that there should exist a time delay 
between flux variations of multiple images of a lensed background source, and demonstrated 
that the time delay is inversely proportional to the Hubble constant Hq. Gravitational lenses 
thus can be used to determine H independently of traditional distance-ladder techniques. 
The lens method of estimating H requires a measurement of the time delay and a determi- 
nation of the mass distribution of the lens. 

A particularly promising candidate for this technique is the first gravitational lens dis- 
covered, the so-called "double quasar" 0957+561 (Walsh, Carswell, & Weymann 1979), which 
consists of a pair of lensed images of a z = 1.41 quasar separated by 6" on the sky. The 
galaxy responsible for the lensing, denoted Gl, was discovered by Stockton (1980). This 
galaxy is a bright cluster elliptical at redshift z = 0.36, residing in a cluster of galaxies which 
also contributes to the lensing (Garrett, Walsh, & Carswell 1992). Long-term monitoring of 
the optical ( |Vanderreist et al. 1992| ; (3child 1990Q and radio ( [Lehar et al. 1992]) light curves 



of the two quasar images has provided strong evidence for a time delay, though there is still 
some uncertainty regarding the actual value of the delay (Press, Rybicki, & Hewitt 1992b, c; 
Schild fc Thomson 1993| ; |Pelt et al. 1995| ). The observational uncertainties in the other ob- 



servables such as the image positions, the relative image magnification, and the source and 
lens redshifts are small, leaving the lens mass distribution as the chief remaining obstacle to 
estimating Hq using this system ( [Borgeest fc Refsdal 1984] ). The mass distribution of the 
lens is difficult to measure directly, and has to be constrained using the observations of the 
lensed images. 

It is customary to postulate a simple functional form for the lens mass profile, and to 
adjust the parameters of the model so as to obtain the best fit to the observables, namely 
the quasar image positions and relative image magnifications. Such calculations have been 
done by various authors in the past (|Young et al. 1980]; [Borgeest fc Refsdal 1984|; Greenfield, 



Roberts, & Burke 1985; |Kochanek 1991|; Falco, Gorenstein, & Shapiro 1991, hereafter [b'GS 



Bernstein, Tyson, & Kochanek 1993, hereafter |BTK| ) using the data available at the time. 



We present here a new model of 0957+561 which improves on previous work in two respects. 

First, we employ a new parameterization of the lens mass, namely the softened power- 
law sphere (SPLS) model, which allows us to explore a wider range of radial mass profiles 
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than in earlier studies. Our model space includes the standard isothermal sphere model (e.g. 
Turner, Ostriker, & Gott 1984) as a particular case, and one of our aims is to use the data 
to determine the degree to which the galaxy deviates from an isothermal radial profile. We 
also include a core radius for the galaxy and represent the surrounding cluster by means of 
a quadratic potential as proposed by FGS. 

Second, we make use of a larger set of data constraints than in any previous study. FGS 
for their modeling used high-resolution VLBI maps of 0957+561 ( |Gorenstein et at. 1988a| ) 



which resolved features of an inner radio jet extending ~ 50 milliarcseconds from the quasar 
core of each image. This permitted them to derive a 4 x 4 relative magnification matrix 
between the two quasar images which they used to constrain their model. Recent A18cm 
VLBI mapping of 0957+56lA,B by Garrett et al. (1994, hereafter |G94j ) revealed even greater 



detail in the fine structure of the quasar images. The A and B inner jet regions are now 
resolved into five centers of emission with sub-milliarcsecond accuracy in the positions. In 



addition, |G94j report a gradient in the relative magnification tensor between the quasar core 
and the end of the inner jet. This provides two additional data constraints which we include 
in our model-fitting. As a result of the additional data, we have a better-constrained lens 
model, with six degrees of freedom in the data fits, as against the previous studies by FGS 
and Kochanek (1991) which had only one degree of freedom. 

In §2 of the paper we briefly introduce the notation and approximations employed in this 
study, as well as the basic lens equations that allow us to test mass models against constraints 
from observation. In §3 we review the observations of 0957+561, paying particular attention 
to the high-resolution VLBI mapping by |G94j which provides the majority of our model 



constraints. We introduce the SPLS lens model in §4 and derive its lensing properties. We 
then employ the lensing equations of §2 to show explicitly the dependence of model-predicted 
observables, including the time delay, upon our various model parameters. For comparison, 
we do the same also for the FGS lens model. In §5 we describe the results of our model-fitting, 
both with our SPLS model and with the FGS model, and including the effects of ellipticity 
in Gl and dumpiness in the cluster. We derive confidence limits on the lens parameters and 
obtain bounds on H . In §6 we address one of the chief difficulties in obtaining tight H 
bounds from 0957+561, namely the uncertainty regarding the amount of lensing contributed 
by the cluster. We describe how this uncertainty can be removed by measuring the velocity 
dispersion of the galaxy, as shown by FGS, or by measuring the shape and velocity dispersion 
of the cluster. We summarize the paper in §7 and discuss prospects for further improvements 
in the method, both with 0957+561 and with other lens systems. 



-4- 



2. Lensing Geometry and Notation 

We employ a Cartesian coordinate system on the sky with the origin at the center of 
mass of the lensing galaxy Gl, the x-axis taken positive to the east, and the y-axis positive 
to the north. We define position angles to be zero toward the north and increasing eastward. 
We employ the standard angular diameter distance appropriate to a Friedmann universe. 
For an observer at redshift and a source at redshift Zj, the angular diameter distance is 
given by 

= O^'u? f ' T' g,) - ^(l + fta,)" 2 , (2-1) 

t>i Hq ilfi (1 + Zi) (1 + Zj) 

where Ho = lOO/i km s _1 Mpc -1 is the Hubble constant. A proper length £ at Zj subtends 
an angle 9 = £/D(zi, Zj) at Zj. It is customary to use the abbreviations D d = D(0,Zd), 
D s = D(0,z s ), and Dd. s = D(za,z s ) when referring to the angular diameter distances from 
observer to deflector, observer to source, and deflector to source, respectively. We also 
introduce the effective lens distance 

D = D d D s /D ds , (2-2) 

which appears in the lensing equations below. 

For simplicity, we assume Qq = 1 in what follows. The sensitivity of the results to 
fi is rather small ( FGS ). For the 0957+561 lens, D increases approximately linearly with 



decreasing Q , to a value ~8% larger for Q = than the fiducial value for Qq — 1. A non- 
zero cosmological constant A can have a more important effect on the results (e.g. Turner 
1990), but we do not explore the dependence in this paper. 



Using standard notation (e.g. Blandford fc Marayan l99"2"; Schneider, Ehlers, & Falco 



1992), we represent angular positions at the distance of the background source (the "source 
plane" ) with the vector f3 and angular positions at the distance of the deflector (the "image 
plane") with 0. The lens mass deflects light rays at the image plane through an angle which 
we represent with a.. These quantities are then related via the lens equation, 

(3 = 0- a (0), (2-3) 

where the reduced deflection angle ct(0) is related to the true deflection angle ct(6) at the 
image plane by 

(X — {—— \ Oi = d. (2-4) 



DJ \D 

The ray deflection function ct(6) may in general allow multiple solutions 6i to the lens 



equation [2-3] for a given f3. If a source happens to lie at such a /3, we observe multiple 



images of the source at positions &i, a gravitational "mirage" 
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The lens deflection a(6) is sensitive only to the surface mass density of the deflector. 
With £ = D<iO representing linear position in the image plane, we may express the lens 
deflection angle as the two-dimensional gradient of a potential ip{$,), 
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The potential ip is related to the lens surface mass density £ according to V|^(£) = 87rG£(£). 
Equivalently, we may express the deflection in terms of the Green's function of the Poisson 
operator, 

For a radially symmetric surface mass density profile £(£), the above equation simplifies to 



2 ' 



(2-6) 



V 



c 2 e 2 y s ' 

where M(£) is the projected mass of the deflector within cylindrical radius £. 
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Gravitational lensing not only causes images of a background source to appear at differ- 
ent positions, but the images are also magnified or demagnified. In general the magnification 
is anisotropic, and is described by a symmetric ( Bourassa fc Kantowski 1975 ) 2 x 2 magni- 
fication tensor [A4 % ] which is given by 



'd0~ 




da 




d/3 




l2 ~d6 


0i_ 
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Here X 2 is the 2x2 identity matrix. We arrive at the leftmost expression via differentiation of 
the lensing equation | |2-3| | . Because we cannot view the unlensed source, we cannot determine 
from observations of the system. If however the background source is multiply imaged 
and the images are resolved, then we can directly measure the relative magnification tensor 
from one image to another, [Ai 1 ^] = (dOi/dOj). The relative magnification is related to the 
lens deflection function a via equation ||2-8|| above: 

-1 r 



M* M 



da 
l2 ~d6 



To - — 



da 
d6 



(2-9) 



Although the magnification tensor is symmetric, the relative magnification tensor is not 
and thus will have four independent components. It is customary to refer to the relative 
magnification tensor in terms of its eigenvalues M± t 2 and corresponding eigenvector position 
angles 4>i,2 ( e -9- FGS). With unresolved images, it is not possible to measure the full relative 
magnification tensor. However, the flux ratio of the two images gives the magnitude of the 
determinant of [A^* J "]. 
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3. Observational Constraints 

3.1. Image Positions and Magnifications 

The highest-resolution images of 0957+561 have been obtained by means of radio VLBI. 
Shortly after the discovery of the source, VLBI observations ( Porcas et al. 19811) revealed 
similar core-jet structure in the radio components A and B, reinforcing the lensing hypothesis 
for the system. Gorenstein et al. (1988a) obtained improved A13 cm VLBI maps and resolved 
the A and B jet images into three Gaussian components each. They were thereby able to 
construct the magnification tensor [Mra] relating the A and B images. Implicit in their 
calculation of [M^a] was the assumption that the tensor remains the same both in the QSO 
core and in the jet region. |G94| have recently obtained even more accurate VLBI maps of 



0957+561 at A18cm and have been able to measure variations of [Mba] along the jet. 
I I 

G94 fit the A and B VLBI images of 0957+561 with six Gaussian components each, 
denoted as A\„$ and Bi_q. A\ and B\ correspond to the respective core components while 
A 2 ...6 and B 2 ..s are successive blobs in the jet. The relative positions of the brightest jet 
components, A 5 and B 5 , with respect to the core components, A\ and B\, are measured to 
within 0.1 mas (Table [I]). In our lens modeling, we use these two image offsets as constraints. 
Because |G94| do not give the separation between images A and B, we take as {A\ — B\) the 
value (-1*25271 ± 0'.'00004, 6'.'04662 ± 0"00004) reported by Gorenstein et al. (1988a). 

The improved resolution of the A18 cm VLBI map allowed G94 to measure the change 
in the relative magnification tensor along the axis of the jet. This gradient is effectively 
measured between components 1 and 5 in the two images. Because of their limited signal to 
noise, |G94| were forced to set two of the magnification tensor gradient components to zero, 
and they evaluated the gradients only of the other two components. Thus, |G94| provide a 



total of six constraints on the magnification tensor, viz. four matrix elements corresponding 
to the transformation from A§ to B§, plus gradients of the two matrix eigenvalues along the 
long axis of the jet. The six constraints are summarized in Table 0. 



3.2. Third Image Flux 



Although gravitational lensing theory predicts a third image of 0957+561 near the center 
of the lensing galaxy, no such image has been seen down to a 5a limit of 0.6 mJy, which is 1/30 
of the flux density of image B (|Gorenstein et al. 1984|) . This flux limit provides an additional 
constraint on the lens model. Because there is some ambiguity in the non-detection of the 
third image flux ( |Gorenstein et al. 1983| ; |Gorenstein et al. 1984| ; Ponometti 1985| ), we have 
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chosen not to treat this constraint in the standard fashion. We instead adopt a weighting 
function that assigns no penalty to a lens model if it predicts a third image flux below the 
5cr detection limit, but where the penalty increases steeply if the predicted flux is greater 
than this limit. Thus, if C/B is the flux ratio between the third image and the B image 
according to a lens model, we take its contribution to the x 2 to be 

f : C/B < 1/30 

X 2 C/B = {C/B - 1/30) 2 . (3-1) 
I (I/ISO) 2 • ° /B > 1/30 

This unorthodox penalty assignment is a compromise between what we view as two unre- 
alistic extremes. On the one hand, treating the 1/30 flux ratio upper limit as a true 5cr 
contribution to the overall \ 2 excessively penalizes models for which the third image flux is 
say only at the 2 or 3<r level and thus would likely have been dismissed as noise in the large 
area scanned for the third image. On the other hand, treating the 1/30 flux ratio as a la 
penalty allows the model too much freedom to produce unrealistically bright third images. 
Although the particular \ 2 assignment given above results in non-Gaussian uncertainties in 
the model parameter values, we feel that this penalty function is the fairest representation 
of the unclear observational status of the third image. We emphasize this point because 
different choices of the third image penalty lead to significantly different model limits on the 
Gl core radius. However, there is very little effect on our results for the Hubble constant. 



3.3. Position of the Gl Center of Mass 

Although the angular separation of the A and B images in 0957+561 is known extremely 
accurately, the positions of these images with respect to the center of mass of Gl are not so 
well constrained (see Table |3|). Stockton (1980) reported an optical center of brightness for 
Gl with a 30 milliarcsecond (mas) uncertainty. VLA observations of the region ( |Roberts~ 



al. 1985|) revealed a point-like source G with 1 mas uncertainty in the position. However, 



VLBI mapping of the same region flGorenstein et al. 1988a| ) found no source coincident with 



the VLA detection, but did detect a weak (0.6 mJy) point source G' some 30 mas away, 
again with 1 mas uncertainty. Both the VLA and VLBI sources are consistent with the 
optical center, but they are inconsistent with each other by many standard deviations. 

In view of the discrepancy between the VLA and VLBI detections, we take the optical 
center of brightness of Gl and its error bars as the reference for the model fitting. Note that 
this only affects two coordinates, namely the two components of (Si— Gl), since all other 
image positions are measured as offsets with respect to either Bi or A\. Our treatment of 
the galaxy center differs from that adopted by [FGS|, who selected the VLBI point source G' 
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as their center of mass position with 1 mas uncertainty. 



4. Lens Mass Models 

Most of our calculations are based on a five-parameter model of the 0957+561 lens 
system where we represent the cluster as a simple astigmatic deflector and the galaxy as 
a power-law deflector with a core radius. We also test the particular five-parameter model 
used by FGS, who have the same form for the cluster contribution, but represent the galaxy 
as a King potential with an additional compact nucleus modeled as a point mass. We shall 
describe the lensing properties of these two models in detail below. 



4.1. The Softened Power-Law Sphere 

Our primary mass model for the lensing galaxy is an extension of the power-law mass 
distribution, M(r) oc r v , used previously in the modeling of lens systems flWambsganss fc 



Paczynski 1994Q . We describe the galaxy by the following spherically symmetric volume 
density profile, 

/ J2\(.V-3)/2 

p{r) = p \l + -\ , (4-1) 
where r c = 6 c Dd is the core radius. For r ^> r c , the mass of the galaxy varies as 

dM ( r \ ( r ? -3 ) 

— _ = 4npr 2 f» 4np r 2 — oc j** 7 , (4-2) 

dr \r c J 

which corresponds to M(r 3> r c ) oc r v . The parameter r\ is restricted to lie between r\ = 0, 
which corresponds to a modified Hubble profile (see below), and rj = 2, which corresponds 
to a constant surface mass density sheet. This family of softened power-law sphere (SPLS) 
models includes the singular isothermal sphere, which corresponds to r c = 0, r\ = 1. 

Although the SPLS density profile does not yield an analytic potential ip(r) or included 
mass M(r), the deflection angle does have an analytic form. To show this, we first obtain 
the surface mass density profile implied by equation ||4- 1 ] : 



where B is the standard Euler beta function. This surface density is a generalization of the 
modified Hubble profile (cf. [Binney fc Tremaine 1987]) , which is the specific case rj = 0. We 



- 9- 



then find that equation | 4-3|| yields an analytic projected mass M(£) of the form 

? 2\ »7/2 



M(£) = 2vr jf* E(£')£' ^' = M) u + ^ 



M =^js r c 2 . (4-4) 



Combining the deflection formula (eq. ||2-7|| ) for a spherically symmetric deflector with equa- 
tion ||4-4j| , we then obtain the deflection function ct(0): 



»(«) = I i 



1+ « 



61. 



/4GM V /2 
V c 2 L> 



(4-5) 



For our model-fitting we adopt an equivalent but more convenient expression for the 
deflection function (eq. ||4-5|| ) which remains well-behaved even when r c — > and the central 
density po diverges: 

0, a E = a 2 /{2 ~ v) 6-^ 2 -n\ (4-6) 

The parameter «e represents an approximate Einstein radius of the galaxy, insofar as 
«(«e) = «e for 9 C = 0, and a(a E ) ~ «e for 9 C ^ 0. 

Note that the basic SPLS galaxy model described here has three adjustable parameters: 
OLE-, 9 C , V- F° r the particular calculations described in §[5.3| we add a point mass at the center 
of the galaxy to investigate if the additional freedom would allow us to obtain a better fit of 
the observations. In these cases, the mass model has four parameters. 




V 



4.2. The FGS Galaxy Deflector 

In this model the lensing galaxy consists of a smooth, circularly symmetric, King-type 
surface density profile parameterized by its angular core radius 9 C and velocity dispersion 
a v . The deflection function employed by FGS for this profile is an analytic approximation 
introduced by Young et al. (1981): 

d(0) [radians] = (^j ^ (4-7) 

a«,(0) = 53.2468/(1.155- -44.0415/ 0.579- , 



/(*) 



:i+x 2 ) 1/2 



X 
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This formula reproduces the deflection of a King profile with cutoff radius 6 t ~ 600# c and is 
accurate to ~1% within 106 c of the center. Because 0957+561A lies ~ 6" from the center of 
the galaxy Gl (we will hereafter refer to the point at the center as Gl), model-fitting with 
this approximation is reliable only for 9 C ^ 0'.'6. 

FGS found that a pure King-type galaxy has too "soft" a potential to reproduce the 
observed magnification ratio between the A and B quasar images satisfactorily. Furthermore, 
the potential does not sufficiently demagnify the third image near the galaxy center. For 
both reasons FGS added to their mass model a compact nucleus at Gl. This component is 
modeled as a point-mass in the lensing equations, with a deflection law given by 

"(0) = 0, (4-8) 

where a^h is the Einstein radius of the compact nucleus. The Einstein radius is related to 
the mass Mbh of the compact nucleus via 

fAGM hh \ 1 ' 2 „ / M bh \ 1/2 / D Y 1/2 , v 

«bh= — ^ =0"94 — ¥-) — — . (4-9) 



V c 2 D J ' \10 U M Q J \ 1G P C , 

Although treated as a point mass, the compact nucleus postulated for the 0957+561 lens 
need only have a radius small in comparison with the ~ 1" separation between image B and 
Gl. 

As in the case of the SPLS model described earlier, the FGS model again has three 
adjustable parameters: a v , 9 C , Mbh- 



4.3. External Sources of Deflection 

The lensing galaxy in 0957+561 is a member of a cluster of galaxies, and because of the 
large angular separation of the A and B images it would appear that part of the deflection 
in this lens is produced by the cluster. Following FGS, we model the cluster deflection by 
means of a convergence k and shear 7 with position angle (p. Such a model is reasonable so 
long as the projected mass density of the cluster is relatively constant over the angular scale 
of the image separation. We consider the effect of deviations from this model in §|5T5". 



The convergence k is the ratio of the local mass surface density of the cluster to the 
critical density, S cr = c 2 D s /47iGDdDd s ( Blandford fc Narayan 1992[) . The angular deflection 
due to a constant convergence takes the simple form ol(0) = n6. Since this corresponds to 
isotropic focusing, there is a degeneracy in lens models which was noted by Falco et al. (1985). 
Given any model of the lensing galaxy which fits the observations, it is possible to find a new 
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model where a constant convergence k of arbitrary magnitude is included and the mass in the 
galaxy is at the same time scaled down by a factor (1 — k). For the particular galaxy models 
described earlier, the parameters which are modified are: a^h i— > o^ h (l — k), a| i— > a|(l — k), 
<t| i— *■ cr^(l — k). All observables are invariant under this transformation, except the time 
delay which scales as (1 — k). The effect on the time delay translates to a change in the 
derived Hubble constant: H i— ► H (1 — k). Since the surface mass density of the cluster has 
to be a positive quantity, we have the constraint k > 0. 

Because of the existence of the simple tranformation described above, it is not necessary 
to include k explicitly in the model. The shear, however, must be included. It transforms 
with variable k as follows: 7 1— > 7' = 7/(1 — k). Therefore, in effect, it is the scaled shear 7' 
which we fit in our models. The deflection law due to shear takes the form 

a(0)= 7 '[T(0)]0, where [T(0)] ee f ^ ^ 

\ sin 2(j) — cos 

Purely from measurements of the lensed images it is not possible to break the model 
degeneracy due to a variable k. However, direct measurements of the mass either in the 
galaxy or the cluster do offer the possibility of breaking the degeneracy. We discuss this in 
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4.4. Deriving H from a Lens Model 

To determine a value for Ho from 0957+561, it is necessary to compute the difference 
in light propagation time from the source to the observer along the two image paths. This 
quantity depends on the image positions and relevant redshifts as well as the mass distribu- 
tion of the lens. In the following discussion, we assume k = for simplicity, and include the 
relevant factor of (1 — k) only at the end. 

To obtain the relative time delay we first determine the delay of the path corresponding 
to a given image location Oi relative to the "unlensed" light path to the observer from the 
source at location f3. This takes the form 

(l + z d )D 2 (l + z d ) 
Ti = — \0i-p\ ~ z — V>\Pi), (4-H) 

where the first term is the geometric time delay and the second is the gravitational time delay 
due to the "Shapiro effect" (Shapiro 1964). The factor (1 + ^) accounts for the expansion of 
the universe since the deflection of the rays at redshift z&. The relative time delay between 
the two images is given by Ar^ = Tj — Tj. Given an SPLS mass model and a fitted source 
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position (3, we then have the following relation for At^: 



cAt, 



ij 



[l + z d )D 



- (3\ 2 - V (el cos 20 - #J cos 20 + 29 x 9 y sin 20 



a 



2-n 



0, 







«bh ln 



\0< 



(4-12) 



where 9 X and 8 y are the x- and y-components of angular position vector in our coordinate 
system. We have cast the time delay equation in the above form to show the separate 
dependencies on model parameters (on the right) and on the effective lens distance D (on 
the left). All other lensing variables appearing in this equation (zj, 0i, 0j) are observables. 
As the King-type galaxy approximation used by FGS has an analytic potential, the time 
delay equation for the FGS model has the following closed form: 



cAr, 



ij 



[l + z d )D 



\0-(3\' 



-0c 



i (e 2 x cos 20 - el cos 20 + 28 x 6 y sin 20 



27.6636 g ( 1.155 jpj - 45.6437 g fo.579^ 



0, 



9{x) 



~ «bh ln 



Vl + x 2 
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ln (l + Vl + x 2 ) . 



The dependence of the predicted Ar^- on H is embedded in D, as can be seen from 
equation [ |2-1| |. Thus, once the model parameters are determined with sufficient precision via 



model-fitting, we obtain an estimate of the quantity HqA^. A measurement of Ar^ then 



leads to an estimate of the Hubble constant. Actually, because of the k degeneracy 
the formula for Ar^- given above should be multiplied by the undetermined factor (1 — k). 
Therefore, technically we can only determine the quantity Hq Ary/(1 — k) from the model. 

It is convenient to define a dimensionless number /i 15 , which allows for all of these 
factors: 

-i u -iA _n ( 1-5 yr' 



H Q = lOO/ii.skms-'Mpc" 1 (1 



At., 



(4-14) 



Each lens model gives a unique value of hi.5. The value of Hq derived from this, however, 
depends on the measured time delay Ar^ and on the unknown magnitude of k. Since we 
know that k > 0, we see that the lens method can provide an upper bound for the Hubble 
constant even when there is no independent determination of k. 

Although the discussion here has focused on 0957+561, the same ideas are applicable 
to any lensing system in which (i) we know the source and deflector redshifts (in order to 
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express D in terms of H ), (ii) we are able to measure a time delay, and (iii) we obtain a 
well-constrained mass model for the lens. If we have a lens system with C quasar images, we 
can in principle measure (C — 1) independent time delays, all of which should be consistent 
with a single value for the Hubble constant. 



5. Results 

The various observations detailed in §|3| furnish a total of fifteen constraints on the mass 
model. Both the SPLS model and the FGS model fit these constraints with nine parameters 
- four source emission coordinates, two each for blob 1 and blob 5, and five variables 
specifying the lens mass distribution. We also study a variant of the SPLS model that 
includes a compact nucleus, for a total of ten parameters. The third image flux constraint 



discussed in § |3T2| only comes into play for models without a Gl compact nucleus. The 
predicted flux of the third image is zero when there is any substantial mass in the nucleus. 
Thus the third image flux upper limit does not influence the FGS model or the SPLS model 
with compact nucleus. Accordingly, we have (15 — 9) = 6 degrees of freedom for the basic 
SPLS model, (14 - 9) = 5 d.f. for the FGS model, and (14 - 10) = 4 d.f. for the SPLS 
with compact nucleus. We discuss the results from the SPLS model in §[TI] and those from 
the FGS model in § |5.2| . In § [5.3| , we discuss the results of adding a compact nucleus to the 
SPLS model. We also investigate the degree to which our SPLS model results vary if we add 
perturbations due to the ellipticity of Gl (§ |5.4|) and the influence of nearby cluster galaxies 

(§PD- 



5.1. Fitting to a Softened Power-Law Sphere 

5.1.1. Goodness of Fit 

We employed the AMOEBA non-linear minimization algorithm (Press et al. 1992a), 
based on the downhill simplex method, to optimize the lens model by minimizing the y 2 
of the fit. Our best-fit model gives y 2 = 26.0 for a chi-square per degree of freedom of 
X 2 = 26.0/6 = 4.3. In Table |] we compare the model predictions of the various observables 
with the measured data. In general, it is seen that the model agrees well with the data; most 
deviations are under la and the largest discrepancies are under 2a. However, despite this 
good agreement, the formal x 2 value is quite large. This is primarily the result of the very 
large correlations which |G94| quote between the errors on the various observables (Table |l|). 



For instance, if we ignore the correlations and define x to be the straight sum of the squares 
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of the values in the last column of Table [|, then our fitted parameter set gives x 2 = 11-2, 
corresponding to x 2 = 1.87. Throughout this paper we always include the correlations when 
calculating x 2 ■ |G94j noted that the value of M 2 predicted by the model of |b'GS| is discrepant 



with the value which they measured. Our best-fit SPLS model has similar difficulty — the 



predicted M<i is almost two standard errors below |G94j 's value 



An interesting feature of Table [| is that almost all of our model x 2 from fitting the 



image positions arises from the (Gl — Bi) constraint. As noted in § |3.3| , we have adopted 
the larger (30 mas) error bars of the observed optical center of brightness flStockton 1980| ) . 
We find that the model takes advantage of this freedom in the center of mass position to 
better satisfy the other, much tighter constraints on QSO image separations from VLBI. Our 
best model (without compact nucleus) chooses a Gl center of mass some 62 mas from the 
measured optical center Gl, 44 mas from the VLBI source G', and 64 mas from the VLA 
source G. Clearly this separation would have been prohibitive if we had used the 1 mas error 
bars of the radio sources as the uncertainty in Gl center of mass position. 

We experiment with forcing the galaxy center of mass to be coincident with G' to the 
1 mas precision of VLBI. This was the procedure of [FGS| in their analysis. In this case 



the SPLS model fits very poorly to the data, never obtaining a x 2 below 442 (x 2 > 73). 
The contribution to the overall x 2 from the magnification tensor constraints only marginally 
increases (to 22.7, up from 21.5), with essentially the entire extra x 2 coming from badly fit 
image positions. Our model has the most difficulty in simultaneously fitting both (G' — Bi) x 
and (Bz,—Bi) x , which are each almost 13 standard errors from the VLBI measured values and 
account for two-thirds of the image position x 2 ■ The FGS model (see § |5.2| ) fares better, but 
still has an unacceptably large x 2 value of 43. Adding a compact nucleus to the SPLS (see 
5\3|) achieves similar results, x 2 = 48. In both cases the x 2 contribution from magnification 



constraints is 60%, as compared with (22.7/442) « 5% for the basic SPLS. 



5.1.2. Fitted SPLS Model Parameters 

In Table |5] we show the best-fit SPLS model parameters and their associated 95% (2a) 
confidence limits. Because of our poor x 2 , we have estimated the 2a bounds such that they 
correspond to Ax 2 = 4% 2 rather than A% 2 = 4. 

Of particular interest are the limits on 6 C and 77. The best-fit core radius is zero, and 
models with core radii in excess of O'.'ll are excluded at the 2a level. At the lens redshift 
of 0.36, this corresponds to a 2a upper limit of 330/i _1 pc on the linear core radius of the 
lensing galaxy. The limit on 8 C arises primarily from the limit on the flux of the third image 
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:/. |Wallington fc JNarayan 1993 ). The predicted third image flux increases rapidly with core 



radius, exceeding the 5cr detection limit of ||Mob|| < 1/30 for 9 C > 63 mas, or r c > 190/i 1 pc. 

The best-fit value of the radial power law index r\ is ~ 1.16, which makes the density 
profile of the lens slightly shallower than isothermal [r\ — 1). If 77 is pushed lower, the 
model compensates by raising the core radius. This is only effective down to 77 ~ 1.10, at 
which point the core radius becomes large enough that the third image flux exceeds the 5a 
detection limit. Because of this, an isothermal profile is ruled out quite strongly. Our best 
"isothermal" model has A% 2 = 84.3, which is unacceptably large. In the other direction, we 
find that the fit degrades rapidly for 77 > 1.17, with 77 > 1.20 excluded at the 6a level. 

We illustrate the bounds on 8 C and 77 in Figure [l], where we display contours of constant 
X 2 corresponding to best-fit models with fixed values of these two parameters. The most 
striking feature of the figure is the fact that the range of models which fall within the 2a 
contour is limited to a narrow valley covering quite a small range of the two parameters. 
This is despite the fact that we have conservatively defined the 2a limit as A% 2 = 4% 2 . In 
other words, the 0957+561 observations do an excellent job of constraining the parameters 
of our mass model despite the large \ 2 f° r the best-fit. Even fairly small deviations in the 
model parameters about their optimal values give a far worse fit to the data. The reason 
is that many of the data constraints have extremely small quoted errors so that the model 
has to be just right even to fit within 2a. The large correlations between the magnification 
matrix elements quoted by G94 (Table [l]) make the problem more acute. 

Note from Figure [| that the best-fit model has 9 C = and therefore lies at one edge of 
the allowed parameter space. Because the surfaces of constant \ 2 are truncated at this edge, 
we do not have gaussian-distributed uncertainties for our parameter estimates. Generally, 
most of the model parameters are tightly constrained to one side of their best-fit value and 
much less constrained in the opposite direction, corresponding to the narrow \ 2 "valley" 
seen in Figure [l]. Despite the non-gaussian errors, we have chosen to use A\ 2 = 4% 2 as our 
definition of the "95%" confidence limits for the parameters. These are the values listed in 
Table |. 



5.1.3. Implications for Hq 

Given a set of fitted model parameters, we can estimate the dimensionless Hubble pa- 
rameter hi,5 as described in §£OJ. Figure shows contours of hi.5 overlaid on contours of 
X 2 - We see that the /ii. 5 contours are roughly parallel to the long axis of the valley of good 
solutions. Because of this, only a narrow range of values of h\s is allowed. 
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We quantify the limits on hi 5 as follows. Given a parameter set p with corresponding 
goodness-of-fit x 2 (p) an d derived Hubble parameter /ii.s(p), we solve for the function x 2 (^i.s) 
via the method of Lagrange multipliers. We obtain points on the x 2 (^i.5) curve by finding 
the parameter set p which minimizes the function 

i 71 (p;A) = x 2 (p) + A/i 1 . 5 ( P ) (5-1) 

for various values of the Lagrange multiplier A. Each choice of A generates a pair of values 
for x 2 an d which represent the least % 2 for the subset of parameter combinations which 
yield the given hi,*,. The results of this procedure are shown in Figure |3|. The best-fit model 
gives hi,s = 0.605, with the 2a interval given by 0.583 < h 15 < 0.658. Substituting this 



value into equation |4-14|, we then obtain 



H = (60.51^ km s^Mpc -1 



K 



1.5 yr 



(82.5j|;jj km s^ 1 Mpc" 1 ) (1 - k) (4^) • (5-2) 



BA 



5.2. Testing the FGS Model 

We have also tested the family of models used by FGS| and described in §fO| We obtain 
a best-fit x 2 of 28.4, corresponding to x 2 = 28.4/5 = 5.7. This is significantly worse than 
the x 2 = 4.3 fit we obtained with the SPLS model. 

In Table |B| we list the best-fit parameter values we obtain with the FGS model, and 
compare these with the values previously obtained by FGS| using their more limited data. 



For almost all parameters, our new estimates deviate significantly from the old ones. This 
is somewhat worrisome since it suggests that this lens model may not be very robust. Inci- 
dentally, if we use the original model parameters as given by FGS| with our data constraints, 
we obtain an extremely poor x 2 of 3.3 x 10 4 . 

Our estimate for h\.5 from the FGS models is 0.732, with 95% confidence limits given 
by 0.658 < hi.5 < 0.795. The 2a lower limit obtained here overlaps the 2a upper limit of the 
SPLS model and therefore the two models may be considered to be marginally consistent. 
However, between the two, the models span quite a wide range of h 15 . This is disturbing since 



it suggests that the data are still unable to constrain the lens model very well. pTK| made 
a similar point with a smaller data set. Another disturbing feature is that the value of hi.5 
suggested by our current best-fit FGS model differs considerably from the value hi.5 ~ 0.60 
which FGS obtained with their orginal fits. This is a reflection of the large changes in the 
parameters of the model as shown in Table 0. Once again it implies that the FGS model is 
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not well-constrained. One odd feature of the FGS model which must be mentioned is the 
extremely massive compact nucleus required with this model. At a mass of 110 billion M & 
(llOMg), this compact nucleus of the model is unlikely to represent a supermassive black 
hole at the center of the galaxy. It is unclear what this mass represents. 



5.3. Results for SPLS Models with Compact Nucleus 
Because |FGS| found that including a compact nucleus at the center of Gl greatly im- 



proved their fit to the data, we have tested modified SPLS models where we add a compact 
central nucleus of mass Mbh. For the expanded SPLS models with compact nucleus we find 
that the best fit model has M bh = 27M 9 and \ 2 = 22.1, for x 2 = 22.1/4 = 5.5. This 
reduced x 2 is similar to the x 2 = 5.7 we obtain with the FGS model-fitting. We see that 
both models with a Gl compact nucleus do a significantly worse job than the basic SPLS. 
In Table [F] we list the parameter values of the best-fit SPLS with a compact nucleus. These 
may be compared with the parameters of the basic SPLS model (Mbh = 0) listed in Table 
[|. We see that r\ has increased from 1.16 to 1.38, so that the deviation from isothermality is 
larger. Most of the other parameters are almost unchanged. The derived Hubble parameter 
is significantly smaller, however, at hi 5 = 0.502. 

As we increase the mass of the compact nucleus beyond the biest-fit 27Mg while op- 
timizing the remaining parameters, we find that the x 2 increases very slowly, not reaching 
Ax 2 = X 2 until Mbh ~ HOM9. Over this range, the core radius steadily rises and the power- 
law exponent drops toward zero. Beyond M « II8M9 the x 2 becomes worse very quickly. 
At Mbh ~ 125Mg, the model can do no better than x 2 — 100. The transition happens at 
the point where the Einstein radius «bh of the nucleus becomes comparable to the angular 
separation between Gl and B. Models with more massive nuclei have great difficulty fitting 
the B image and therefore give large x 2 ■ Near the limiting mass, the core radius of the 
galaxy becomes large, comparable to the Gl-B separation, and the index 77 approaches zero. 

Table |7| gives parameter values corresponding to a few specially selected SPLS models 
with compact nucleus. In addition to the best-fit model which we have already discussed, 
we show the best-fit isothermal (r/ = 1) model, where M b h = 78.8M 9 , and an FGS-like SPLS 
where we fix Mbh equal to the optimum value obtained with the FGS model. Generally, we 
find that up to Mbh ~ 50M 9 , the core radius is zero and r\ increases. Above this mass, r\ 
starts decreasing and the core radius goes up. The variations are particularly rapid as Mbh 
approaches the limiting value. 
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The derived values of the Hubble parameter shown in Table [7| reveal a similar behavior. 
Until M bh ~ 50M 9 , /i 15 decreases, going down to about 0.5. For more massive nuclei, h 15 
turns around, increasing almost to 0.75 near the FGS mass. An interesting result is that 
the value of we obtain with the modified SPLS model is very similar to that with the 
FGS model at the same mass of the nucleus. Therefore, the question of whether or not to 
take seriously the value of hi 5 = 0.75 obtained with the FGS model boils down to whether 
or not a nucleus with a mass of 110 billion M & is reasonable. We ourselves find this mass 
uncomfortably large. A more reasonable nucleus of a few billion M@, corresponding say to 
an AGN-like supermassive black hole, gives results very similar to those of the basic SPLS 
model described in §|5J] and therefore suggests hi.5 ~ 0.6. 



5.4. Considering an Elliptical Gl Mass Distribution 

All of the above models for the 0957+561 lens system assume azimuthal symmetry of 
the galaxy Gl. While azimuthal symmetry is expedient for calculations, there is evidence 
even in the early observations of Gl by Young et al. (1980) that the galaxy has elliptical 
isophotes. |BTK| disputed the Young et al. ellipticity measurement of e = 0.13, suggesting 
that contamination from the inner quasar image led to an underestimate of the true ellipticity. 



From their own observations, |BTK| reported a more substantial ellipticity e ~ 0.30. Both 
studies agreed that the position angle of the Gl major axis is 55° east of north. If the 
Gl mass distribution is as flattened as the isophotes, one might worry that the poor x 2 of 
the various models discussed above stems from their common assumption of Gl azimuthal 
symmetry. In addition, we would like to be assured that the derived H for the SPLS model 
is not significantly affected if the Gl mass distribution includes some ellipticity. 

We address these concerns by fitting the data to an elliptical variant of the SPLS. 
The most straightforward adaptation of our spherical mass distribution to an elliptical mass 
distribution replaces equation ||4-3|| by 



£(£, (p) = S 1 1 + J- [1 - e cos 2(<p - <p e )] J , (5-3) 

where ip is the position angle on the sky with respect to the center of Gl. The circular 
isodensity contours of the SPLS are now replaced with concentric ellipsoids having common 
asphericity parameter e and major-axis position angle (p £ . We refer to this model as the 
softened power-law homoeoid (SPLH). In order not to introduce additional free parameters 
into our fitting, we fix the ellipticity and position angle of the SPLH to match the isophotal 
ellipticity and position angle of Gl as observed by [BTK . 
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Unfortunately, there are very few instances in which the deflection function for elliptical 
mass distributions can be expressed in closed form. Kassiola & Kovner (1993, hereafter |KK| ) 
pointed out that the softened isothermal (77 = 1) homoeoid is one such instance in which 
the lensing properties are analytic. We have found that the deflection function for singular 



0) power-law homoeoids may also be expressed in closed form. In § |A. 1| we provide a 



more detailed discussion of the lensing properties of these particular cases. As we noted in 



5.1.1] , the introduction of an adjustable radial power-law index 77 is the distinctive feature of 
our mass model that allows us to fit the 0957+561 system without a supermassive black hole 
and with a much improved x 2 over forced-isothermal models. We therefore hesitate to restrict 
ourselves to an isothermal homoeoid, as do Kormann, Schneider, & Bartelmann (1994) in 
their unsuccessful attempt to model the gravitational lens producing the quadruple image 
B1422+231. We also do not wish to restrict ourselves to only singular elliptical power-law 
mass distributions because we have seen that the data accomodate finite-core SPLS models 
(c/. Fig. [I]), even if our best-fit SPLS has a vanishing core radius. 

The only alternative that allows us to work with lens equations in closed form is to 
approximate the SPLH with a suitable elliptical potential model. As we show in §|A.2 



the tilted Plummer family of elliptical potentials possesses the appropriate parameters to 
approximate the behavior of the SPLH. The approximation is quite good for the 0957+561 
system because we are fitting Gl with a core radius much smaller than the galaxy-image 
separations, and we assume that the mass distribution is no more elliptical than the observed 



isophotal ellipticity e 0.30 mentioned above. [KK| have shown that for such small ellipticity 



an elliptical potential is a very accurate representation of the true potential of an elliptical 



mass distribution. We refer the reader to § |A.2| for a detailed explanation of the tilted 
Plummer potential approximation. 

Taking the Gl mass distribution to be as flattened as the surface brightness, e = 0.30, 
we find that the SPLH fits slightly better than the SPLS. We summarize the results of our 
model-fitting with an elliptical Gl in Table |8|. The SPLH gives x 2 = 3.8 in contrast to 
X 2 = 4.3 for the SPLS, both with six degrees of freedom. Almost all of the x 2 reduction 
comes from a better fit to the observed (Gl — B) y . The elliptical Gl does not help the 
problematic M2, which still remains 1.8 standard deviations below the value observed by 
G94| . Comparing Table § with Table ||, we see that adding ellipticity to the Gl density 



profile makes little change to the galaxy mass model parameters. In particular, the elliptical 
model best-fit core radius 9 C <C 1 mas, Einstein radius «£ = 2'.'51, and power- law exponent 
77 = 1.157 are all well within the 2a confidence limits of the SPLS fitting. While the best- 
fit scaled shear remains exactly the same at 7' = 0.224, we find that the optimal shear 
position angle rotates substantially to = — 76?9. This value is far outside the 2a bounds 
-65?1 < < -63?3 we obtain when fitting to the SPLS. 
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These results are not entirely surprising, as we might expect the shift to an elliptical 
Gl to have the most impact upon the fitted external shear and position angle. Just as 7 
compensates for the shear induced by the cluster, so too does it compensate for shear caused 
by the lensing galaxy and not adequately reproduced by a SPLS model. Now introducing a 
separate ellipticity due to Gl, we find that the external shear adjusts so as to model only 
the cluster shear, but to first order none of the Gl parameters are affected. Of course, we 
have assumed that the mass ellipticity of Gl is the same as its isophotal ellipticity. If the 
mass is significantly more distorted, then we expect bigger changes. 

Despite the significant shift in external shear direction, the best-fit elliptical Gl model 
gives a Hubble parameter of hi^ = 0.616. This is a deviation of less than two percent from 
the hi.5 = 0.605 we obtained with the SPLS, and well within the 2a SPLS confidence interval 
of 0.583 < hi.5 < 0.658. We therefore conclude that asphericity of the mass in Gl does not 
significantly alter the results we have obtained by fitting an SPLS, particularly with regard 
to the derived Hubble constant. 



5.5. Considering Perturbations to the Cluster Model 

Up to this point, all of our 0957+561 lens models have assumed that we may add to 
the potential of Gl a locally quadratic potential due to the cluster, characterized by fixed 
convergence, shear, and shear position angle. As we have seen in § |5.4| , this "external" 
quadratic potential may also compensate for failings in our particular choice of the Gl mass 
distribution. Even if we were to have chanced upon the perfect parameterization of the Gl 
potential, we may nonetheless obtain a poor match to the observations if there were too much 
"dumpiness" in the local cluster potential around Gl for our quadratic external potential 
to handle. This possibility is heightened by the large (~ 6") separation between the two 
images of 0957+561. Assuming for the moment that the cluster dark matter potential is 
locally quadratic about Gl, we may then ask if there are other galaxies sufficiently close to 
Gl that their differential lensing properties from image A to image B might cause problems. 

Angonin-Willaime, Soucail, & Vanderreist (1994) surveyed the 0957+561 region, ob- 
taining photometry complete to R = 24 in a 4'.5 field as well as spectroscopy of 34 galaxies 
in a 6' field. Their galaxy redshifts confirmed the existence of a cluster at mean redshift 
z = 0.355 containing Gl, which had been suggested by earlier redshift measurements from 
Garrett, Walsh, & Carswell (1992). From a sample of 21 member galaxies, Angonin-Willaime 
et al. concluded that the cluster containing Gl is extended and poor, with a large (> 50%) 
spiral fraction. Of that sample, only two galaxies (numbered 20 and 21 in Table 2 of Angonin- 
Willaime et al. 1992) are located within 30" of either 0957+561A or B. Both galaxies, here- 
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after referred to as G20 and G21, are faint ellipticals at redshift z = 0.355. Both are also 
sufficiently near the quasar images for concern about their contribution to higher-order terms 
in the expansion of the potential about Gl. The nearest of the two is G20, with magnitude 
R = 20.69 and offset from 0957+561B of Ax = 7"69, Ay = 2791. G21 has magnitude 
R = 21.37 and 0957+561B offset of Ax = 11705, Ay = -2755. 

In order to test what effect these nearby galaxies might have on our results, we include 
them as singular isothermal spheres. The singular isothermal sphere is an attractive candi- 
date not only because of its simple lensing properties, but also because of its relatively slow 
density falloff at large radii. This allows us to be conservative in estimating the perturbative 
effect of G20 and G21 on the lensing at Gl. To assign isothermal velocity dispersions a 
to the galaxies, we use the Faber- Jackson relation L = L*(cr /er*) n (Faber & Jackson 1976). 
Following Kochanek (1993) we adopt the values n = 4 and o~* = 245 km s _1 appropriate for 
E/SO galaxies. As noted by Kochanek, 245 km s -1 is at the high end of the estimated o~* 
range of 183-248 km s" 1 from dynamical estimates. Therefore we are being still more con- 
servative with regard to the possible degree of lensing perturbation from the nearby galaxies. 
We arrive at isothermal velocity dispersions of 193 km s" 1 for G20 and 165 km s -1 for G21. 
These may be compared with the Gl value of 330 km s _1 given in |FGS], also obtained with 



the Faber- Jackson relation. The corresponding Einstein radii are 07646 for G20 and 07472 
for G21. We point out that these Einstein radii are small fractions of the galaxies' distances 
from the 0957+561 images — less than 10% for G20 and less than 5% for G21. 

We summarize in Table |8| the results of our model-fitting with a perturbed cluster. 
The additional galaxies bring the reduced chi-square of the fit down to 3.4, a significant 
improvement over the original SPLS and slightly superior to the SPLH. Better fitting of the 
quasar image positions is responsible for almost all of the x 2 reduction, as is the case in our 
study of Gl ellipticity (see §|5.4|). Similarly, we find little improvement in either the overall 



magnification \ or the persistent discrepancy between model-predicted and observed M 2 . 

The best-fit Gl Einstein radius and cluster scaled shear are both down by ~ 10%. 
The reduction of the Einstein radius is because the the two external galaxies contribute an 
effective mass density, or convergence k, in the vicinity of the lensed images. As we have 
already discussed in § |4~3"1 , any external n leads to a corresponding reduction in the mass of 
the primary galaxy Gl. The magnitude of the shear decreases partly for the same reason and 
partly because the external galaxies themselves produce some of the shear needed to explain 
the geometry of 0957+561. The shear position angle rotates by a significant amount (to 
— 61?0), again indicating that the external galaxies have absorbed a fraction of the required 
shear. The Gl core radius and power-law index show no significant change, implying that 
these parameters are quite robust. 
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The best-fit Hubble parameter is 0.582, a drop of some 4% from the 0.605 value 
obtained using the plain SPLS. This shift is compatible with the 2a uncertainty we quoted 
for the SPLS model. The fact that H goes down rather than up is easily explained by 
the fact that the two external galaxies have introduced an effective convergence into the 
model (see §fOf). In conclusion, there are no qualitative surprises from these calculations. 
Quantitatively, we find that the two nearest galaxies in the cluster have insignificant effect 
on our estimate of the Hubble constant. 



6. Eliminating the Cluster Degeneracy 



The focus to this point has been to show how recent VLBI observations of the lensed 
images 0957+56lA,B limit the range of possible lens mass models, thereby limiting the 
model-dependent uncertainty in the determination of H using this system. Quantitatively, 
the result is expressed in the bounds on the Hubble parameter /i 15 given in the previous 



section. In order to obtain Hq, we see from equation ||4-14|| that we need a measurement of 
the relative time delay Atba and a determination of the cluster convergence k. Considerable 
work has gone into the former (e.g. |Vanderreist et al. 1989| ; |Schild 1990| ; [Lehar et al. 1992| ; 
Press et at. T5WI^ , |Schild fc Thomson 199$ [Pelt et al. Egg [Pelt et al 13551 ), and it is only 
a matter of time before a precise (±2%) value of At B a will be settled upon for 0957+561. 



Constraining the cluster convergence is a more difficult problem. As described in § |4.3| , the 
factor (1 — k) in the Hq equation |[4-14|| cannot be eliminated purely by observations of the 
lensed images. However, as Falco et al. (1991) showed, it is possible to estimate this factor 
by measuring the velocity dispersion of the lensing galaxy. We apply this method in § |6.1| 
below. Alternatively, a measurement of the velocity dispersion of the cluster may be used 



6.1. Velocity Dispersion of the Lensing Galaxy Gl 

The Falco et al. (1985) degeneracy arises because all image observables are unchanged 
if the lensing galaxy mass is lowered by a factor (1 — k) and replaced by a mass sheet with 
convergence k. However, when such a transformation is made, the velocity dispersion of the 
galaxy will be lower than in the original model since the galaxy now has less mass. Turning 
this around, a measurement of the velocity dispersion of the lensing galaxy allows us to 
normalize the lens mass model and thereby constrain k. Because the mass of the galaxy 
scales linearly with the square of the line-of-sight velocity dispersion, (vf os ), we may rewrite 
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equation | 4-14 | as 

Ho = (lOO/n.skms^MpcT 1 ) ( ^1°^ ) ( ^ ) , (6-1) 




where (f los ) mod is the expected velocity dispersion of the model lens galaxy in the limit that 
the surrounding cluster has zero convergence. 

Falco et al. (1991) applied this method to obtain their result of 

H = (60 km s^Mpc- 1 ) ( p-^-) " ( . 

V y ) ^390 kms- 1 / \1.5 yr/ 

Our analogous result from fitting the FGS model to the new VLBI data (§ |5.2|) is 
iJ = (73kms- 1 Mpc- 1 ^ ° v ^ ( A ™\ ' 



,341 kms V \1.5yry 

In practice, one does not measure the dark matter velocity dispersion a v , but rather the 
velocity dispersion (tf os ) of the luminous matter in the lensing galaxy, which need not be 
equal to a 2 . Thus it is dangerous simply to take a measured Gl velocity dispersion and 
substitute it for a 2 in the above equations. 

In the following calculations with the SPLS model, we use the virial theorem to estimate 
the model stellar velocity dispersion (vf os ) mod directly, for use in equation ||6-1|| . For notational 



convenience we define a 2 = (vf og ). In order to obtain a dispersion estimate that most directly 
compares with observation, our calculations below account for possible anisotropic orbits of 
the stars in the galaxy and also for the finite aperture of the slit used in the velocity dispersion 
measurements. 

By taking a suitable moment of the Jean's equation, Kochanek (1993) has shown that 
the line-of-sight velocity dispersion of the stars in a spherical galaxy satisfies 



G 



rv(r)M(r) dr 



C = It) djL 7^ ' (6 " 2) 

•-> ' I r u(r) dr 



o 



where M(r) is the enclosed total mass at radius r, and u(r) is the volume luminosity density. 
Note that equation ||6-2|| explicitly allows for the possibility that the luminous matter may 



have a different distribution than the total mass. For the SPLS lens model, M(r) is given 
by 

M(r) = 4vr J r' 2 p(r') dr' = 4vrp jT M + T — J r' 2 dr' 

4tt , /3 3-ri 5 r 2 \ . . 
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where 2-^1 ( a > b, c; z) is the hypergeometric function (cf. Abramowitz fc Stegun 1964 ). We 
numerically evaluate 2-^1(0, b, c; z) by making use of the fact that the function solves the 
hypergeometric differential equation 



*(1 - z) — = a&F - c - (a + 6 + 1)* 



(6-4) 



To obtain v(r) we start with the observed surface brightness profile of Gl, which is 
well- fit by a de Vaucouleurs profile with radius R e = 4'.'5 ± 0'/64 ( |B1'K| ) . We then take 
u(r) to be the spherically symmetric function which produces the de Vaucouleurs (1948) 
surface brightness function I(R) = J e exp{— 7.67 
Abel inversion (it cf. Binney &; Tremaine 1987| ) of t 
to 

dI(R) 



u(r) = -- [ 

7T Jr 



(R/R e ) l l A — 1 j. We compute v{r) via 
re de Vaucouleurs profile I(R) according 

dR 



dR y/R 2 -r 2 ' 



(6-5) 



Two further issues need to be considered before using equation [p-2|| to estimate cr mod . 



First, the derivation of equation ||6-2|| assumes that the entire luminosity distribution is 
sampled — the integrals of equation ||6-2(| are evaluated out to infinite radius from the galaxy 
center. In practice, a velocity dispersion measurement is taken through a narrow-slit mask, 
and thus gives disproportionate weighting to the central region of the galaxy luminosity 
distribution. Kochanek (1993) addresses this issue and gives a corrected form of equation fH] 
H taking into account finite-aperture effects. We have applied Kochanek's method assuming 
that the velocity dispersion of the galaxy is measured with a long slit of angular width 1". 
The second issue concerns the degree of isotropy of the stellar orbits. Whereas equation ||6-2|| 
when integrated out to infinity is valid regardless of the shapes of the orbits, the predicted 
dispersion measured through a finite aperture varies with orbit anisotropy. Kochanek (1993) 
has considered this issue as well and has presented results for a stellar system with a constant 
anisotropy factor, q, relating radial and tangential velocity dispersions (ofo *) of orbits at 
any given radius: 

(6-6) 



:i-<zK(r) 



Assuming isotropic (q = 0) orbits in Gl and a 1" slit aperture, we find that the best- 
fit SPLS model without compact nucleus (Table ||) predicts cr mo d = 321.7^ km s" 1 , with 
95% confidence limits. We show in Figure ^ a map of the predicted velocity dispersion as 
a function of core radius and power- law exponent. From this plot we see that contours of 
constant er mo d are remarkably parallel to the contours of x 2 , so that <7 mo d varies by only 
~ ±1% over the range of allowed lens model parameters. A somewhat larger uncertainty is 
present if we allow for anisotropic orbits. Figure |5] shows x 2 versus <7 mo d corresponding to 
three values of q: —0.2, 0, 0.2. We find that the shape of the curve is virtually unchanged 
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over this range of anisotropy, but the best-fit velocity scales approximately as (1 — g) ' 14 . 
If we assume that the bright cluster elliptical Gl has \q\ < 0.2, the fractional uncertainty 
in cr^ od due to uncertainty in q is no more than 6%. This is reassuring, as we expect the 
uncertainty in the measured dispersion to be at least this high. 

Thus, we finally obtain the following result for the Hubble constant: 

- 82 - 5 -(32&) 2 (S) tas " lMpc "' < 6 - 7 > 

where we now have absorbed the model parameter uncertainty in <7^ od into the error estimate 
on the coefficient. We have also included (in quadrature) the aforementioned 6% uncertainty 
due to the unknown Gl anisotropy q. Equation ||6-7|| shows that with good measurements of 
the relative time delay and of the velocity dispersion of the galaxy in 0957+561, it is possible 
to obtain a comparatively precise determination of the Hubble constant. The main residual 
uncertainty would arise from our incomplete coverage of model space by restricting ourselves 
to the SPLS model. We discuss this issue in 57. 



6.2. Direct Measurement of the Cluster Potential 

In addition to measuring the velocity dispersion of Gl, we may also break the degeneracy 
between the lensing galaxy and the cluster by making direct measurements of the surrounding 
cluster. One approach is to measure the core radius 6 c i and velocity dispersion a c i of the 
cluster. Assume that the cluster potential <p c i corresponds to a softened isothermal sphere 
with the following simple form, 

where b c i is the critical radius of the cluster. If ( represents the angular separation of the 
0957+561 images from the cluster center, the expansion of equation | |6-8| ] at the galaxy center 
then yields a local convergence k of the form ( [Kochanek 1991| ) 



b d (C 2 + 291) , , 

K = dls - f 6-9 

This relation allows us to estimate k knowing the velocity dispersion of the cluster a c i, 
its core radius 8 c i, and the position of the cluster center. If we take a c \ = 600 kms -1 , 
6 d = 30", and ( = 25" as suggested by Rhee et al. (1995), we obtain k = 0.13. The present 
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uncertainty in these cluster parameters places the cluster convergence in the approximate 
range 0.1 <^ k <^ 0.2. Equation [ |6-9|| is valid only if the cluster is smooth and relaxed (see 



Alternately one may try to infer the cluster potential directly through its lensing effects 
on background galaxies. Kaiser & Squires (1993) showed that the shearing of background 
galaxies by a galaxy cluster may be inverted to recover a map of the projected cluster mass 
density, though subject to an overall degeneracy ([Schneider fc Seitz 1995| ) similar to the Falco 
et al. (1985) degeneracy discussed above. The degeneracy can be removed by measuring 
magnifications of background galaxies in addition to shear distortions ( [Broadhurst et al. 



|1995| ; |Bartelmann fc JNarayan 1995| ). Preliminary results using a variant of the Kaiser & 



Squires method have been obtained for the cluster in 0957+561 ( pahle et al. 199"3| ; Rhee, 
Fischer, & Tyson 1995) and it is hoped that more detailed information on the cluster potential 
will soon be available. Such studies will at the very least provide a check on whether or not 
the quadratic cluster potential model introduced by FGS and used in this paper is valid. An 
additional test would be to compare the scaled shear and its orientation as determined by 
our model with the direct estimates of these quantities from the cluster mass reconstruction. 



7. Summary and Discussion 

The main result of this paper is that we have developed a new and more general model 
of the lensing mass in the double quasar 0957+561. The model consists of two parts: (i) a 
three parameter softened power-law sphere (SPLS) mass model (§ [4.1| ) for the primary lens 
galaxy Gl, which is more general than previous models in that it allows for a variable power- 
law radial dependence, and (ii) a two-parameter model (§[4.3|) of the form proposed by FGS 
to describe the shear due to the surrounding cluster of galaxies. By using a larger set of 
data constraints than in previous analyses, and by making use especially of the recent VLBI 
observations of Garrett et al. (1994), we are able to constrain the parameters of our model 
quite tightly. The best-fit values of the five model parameters and their 95% confidence 
limits are shown in Table |^. 

We obtain quite a strong upper limit on the angular core radius of the galaxy. This 
limit is set primarily by the requirement that the model be consistent with the absence of a 
third image of the quasar near the center of the lensing galaxy. The limit on the linear core 
radius is 330/i _1 pc, which is similar to limits obtained by Wallington & Narayan (1993) and 
[KK| . The main difference is that our result refers to a specific galaxy whereas Wallington & 



Narayan and KK carried out a statistical analysis of the ensemble of lens galaxies. 
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We are able to constrain the radial index rj of the lens model (cf. eq. ||4-1|| ) to within a 
10% range. The allowed range of 77 is close to the isothermal value, 77 = 1. However, exact 
isothermality seems to be ruled out and our model suggests that the density falls more slowly 
than r~ 2 at large radius. The tight constraint on rj plus the fact that the model makes use 
of images located as far as 5" from the galaxy center makes this a fairly significant result. 
The light from the lens galaxy fits a de Vaucouleurs profile (BTK| , |Angonin-Willaime ei 



al. 1994j ), which falls off at large radii much more steeply than a mass-traces-light isothermal 
distribution. Our lens model therefore indicates a substantial amount of dark matter out 
to at least ~ 15/i _1 kpc from the center of the lens. A similar result was obtained by 
Kochanek (1995) for the lensing galaxy in the radio ring source MG1654, but out to a 
somewhat smaller radius. Using a different approach, but again based on gravitational 
lensing, Brainerd, Blandford, & Smail (1995) found evidence for isothermal halos in galaxies 
extending over 100 kpc from the center. These studies illustrate the unique advantages of 
gravitational lensing for studying the mass distributions of distant galaxies. 

In addition to the SPLS model, we also test an FGS-like model of the lensing galaxy 
which consists of an approximate King potential (with two parameters) plus a compact 
central mass. The idea here is to repeat the analysis of FGS using our larger data set to 
better constrain the model. The best-fit FGS parameters are shown in Table |6|, and turn 
out to be quite different from the values published by FGS. Adding more data thus seems to 
have modified the model significantly, suggesting that the parameter values determined by 
FGS were not robust. One reason could be that FGS had only one degree of freedom in their 
fit. Another noteworthy feature of the FGS model is that it requires an enormous central 
mass of ~ 1O 11 M . Such a large mass is unlikely to be a nuclear black hole and it is not clear 
exactly what it represents. As a result of the large central mass, the FGS model also has a 
large core radius in excess of the 0957+561B separation from the galaxy center. One of the 
advantages of the SPLS model is that it does not require a central mass. Nevertheless, in 
analogy with the FGS model, we try to see what would be the effect of adding a point mass 
to the SPLS model. The results are discussed in §^]3] and shown in Table |7|. In brief we find 
that the reduced x 2 worsens by 1.2 when a central mass is added, which suggests that the 
data do not favor the inclusion of such a mass. 

One troubling feature of our best-fit model is that the reduced x 2 is quite large, x 2 — 4.3. 
Perhaps our model, despite being more general than previous ones, is still too simple to fit 
all the data. Incorporating reasonable perturbations due to Gl ellipticity (§ |5.4| ) and nearby 
cluster galaxies ( §5.5| ) results in modest x 2 improvement, but never do we obtain x 2 < 3.5 
for any of our SPLS variants. Efforts are under way to map the 0957+561 cluster mass 
distribution using weak distortions of background galaxies ( |Khee et al. 1995| ). These studies 



should help to show whether an FGS-like quadratic potential is sufficient to account for the 



-28- 



cluster lensing local to Gl. If our cluster model is adequate, and we simply have not hit 
upon the correct shape for Gl, we must look to alternate mass models for the lensing galaxy. 
To this end, we plan to explore more general mass-traces-light models, such as the following 
circularly-symmetric, four-parameter model proposed by Tremaine (1995): 

al ( 7 -/3)A* 



E(i?) 



i + : r 

Ri 



(7-1) 



where Rb is the galaxy core radius. Surface brightness profiles of many galaxies are fit by 
this functional form with 1 ^ a ^ 2; 3^/3^4; and 7 <; 0.3. 

The poor fit we find with the FGS model is also troubling. Using the FGS model, we 
obtain a best-fit x 2 = 5-7, rather worse than the SPLS and much worse than the y 2 ~ 1.3 
quoted by FGS in their earlier fit to lower-resolution data. We are surprised that neither the 
SPLS nor the FGS-type family of models can accommodate the recent VLBI data of |G94. 



Why is x 2 large for all models tested? Part of the reason seems to be that some of the data are 
measured with extraordinary precision, e.g. the positions of the quasar jet emission blobs are 
known to 0.1 mas (Table [I]). The models are therefore penalized even for very small errors. 
In fact, had we adopted the Falco et al. (1991) convention of 1 mas VLBI positioning of the 
Gl center of mass, both the SPLS and the FGS models would have been grossly inconsistent 
with the data (§ |5.1.1|) . As noted in [FGS| , having the galaxy center of mass coincident with 
the VLBI source provides a natural explanation that the radio emission originates in the 
core of this bright cluster elliptical. Our current model has no explanation for this observed 
radio emission. Another reason for the large x 2 °f an our models is the large correlations 
quoted by |G94| among the elements of the 0957+561 relative magnification matrix and its 



gradient. These correlations (Table 0) cause even fairly good fits of the measurements (Table 
|J) to contribute large amounts to the overall x 2 ■ Further VLBI observations of the jets in 
0957+561 would be desirable both to confirm the present results and to improve the quality 
of the constraints. 

In view of the large x 2 values, we have been conservative in setting confidence limits on 
the various parameter estimates (§ |5.1.2| ) — we have taken the 95% limits to correspond with 
a x 2 increase of 4% 2 rather than 4. Despite this conservative approach, the SPLS mass model 
parameters are well constrained. Moreover, when we perturb the fitting, either by giving the 
Gl mass distribution a modest ellipticity corresponding to the observed Gl isophotes (§ |5.4| ) 
or by including the nearest observed cluster members explicitly in the mass model (§ |5.5| ), we 
see little change in most fitted model parameters and insignificant difference in the derived 
Hubble constant. 



The primary aim of developing a mass model for the lensing galaxy in 0957+561 was 
to use the model to estimate the Hubble constant H . Our main result is given in equation 
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5-2|| . The tight constraints that we obtain for the parameters of the lens model translate 



via equation ||4-12| to a correspondingly tight constraint on the relation for H in terms of 



the relative time delay Atba- The coefficient in H equation |5-2| has an uncertainty of only 



about ±5%. The allowed range of Hq is asymmetric with respect to the optimum value for 



reasons discussed in § |5.1.3| . Note that the results quoted in this paper correspond to a flat 
universe (S7q = 1)- Our derived value of H increases almost linearly with decreasing Q , 
with a total increase less than 10% for Q = 0. 



As equation ||5-2|| shows, 0957+561 could be used to obtain a useful estimate of H pro- 



vided the relative time delay Atba is measured with sufficient precision and the convergence 
k due to the cluster is estimated. Many years of work have gone into collecting the required 
data for estimating At B a, both in optical ( |Vanderreist et al. 1989| ; [Schild 1990| ) and radio 



( |Lchar et al. 1992] ) . Estimates of Atba have however varied, with two distinct values emerg- 



ing from different data sets and analyses: Atba ~ 410 days ( Schild & Thomson 1993 ; Pelt 



\et al. 1994| ; |Pelt et al. 19951) and Ar BA ~ 540 days (Press et al. 1992c|) . The two estimates 
individually have very small formal errors, so that they are seriously inconsistent with each 
other. Ongoing work is expected to resolve this problem shortly. Meanwhile, in this paper, 
we have explicitly presented our results according to both claimed time delays, using scaling 
factors (Ar^/Elyr) and (Ar^/l^yr) respectively. 

The factor (1 — ac) in equation [|4-14j| arises because of a degeneracy in lens models 
discovered by Falco et al. (1985) and Gorenstein et al. (1988b). These authors showed that 
any lens model can be modified by reducing the mass in the lens by an arbitrary factor 
and substituting a constant density mass sheet of appropriate convergence k. In such a 
transformation, all image observables except the time delay remain invariant. This means 
that a given set of lens observations cannot provide a unique mass model but rather a one- 
parameter family of models parametrized by k. Since the value of k modifies the predicted 
time delay, this unfortunately means that we cannot obtain a unique estimate of Hq from a 
given lens unless we independently estimate k. Note, however, that k must necessarily be a 
positive number since it is proportional to the surface mass density of the sheet. Therefore, 



we can always obtain an upper bound on H (Narayan 1991). From equation ||5-2|| we see that 
the 95% upper bound is Hq = 65 kms _1 Mpc _1 for Atba = 1-5 yr and Ho = 88 kms _1 Mpc _1 
for Atba = 1-1 yr- 

FGS showed that it is possible to eliminate the k degeneracy if we could measure the 
line-of-sight velocity dispersion (tf os ) = u 2 of the lensing galaxy. The idea is that the family 
of models with different values of k have different amounts of mass in the primary lensing 
galaxy. This mass can be scaled to a 2 through the virial theorem. We discuss this approach 
in §6.1| and show how the method would work in the case of the SPLS model. Equation ||6-7| 
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gives the final result. Rhee (1991) measured a of the stars in the lensing galaxy in 0957+561 
to be 303 ± 50 kms -1 . Unfortunately, the measurement is not sufficiently precise to provide 
a significant constraint on H . A more precise measurement of a should be possible with a 
large optical telescope and is highly desirable, as it would eliminate this last uncertainty in 
the modeling of 0957+561. 

The velocity dispersion a refers only to the stars in the lensing galaxy, whereas the 
gravitational lensing is done by the total mass. There has been some uncertainty as to how 
the measured stellar a should be related to the a of the total mass, which is the relevant 
quantity for normalizing the galaxy mass distribution. The straightforward approach is to 
assume that the velocity dispersion of the stars and that of the dark matter particles are 
equal. However, Turner et al. (1985) argued that in many circumstances the a of the stars 
would be lower than that of the total mass by a factor of (2/3) 1 / 2 . This makes quite an 
important difference to the results. For instance, Narayan (1991) derived on the basis of 
the FGS model, assuming Atba = 536 days QPress et al. 1992b] ) and a = 303 kins" 1 QRhee 



1991| ), that H = 37 kms 1 Mpc 1 if the dark matter has the same dispersion as the stars 



and H = 56 kms -1 Mpc -1 if the correction factor of (2/3) 1//2 is applied. The approach used 
in this paper, based on the work of Kochanek (1993), avoids the ambiguity since it is based 
on a fundamental application of the virial theorem. Our model here gives, for the same 
parameters as the ones employed by Narayan (1991), Hq = 55 kms -1 Mpc -1 . 

Narayan (1991) has discussed an interesting additional benefit that one obtains by 
measuring a. Once the mass of the lens has been normalized through such a measurement, 
it is possible to obtain an estimate of H$ that is independent of the source redshift. In 
other words, the distance to the source drops out of the relations. This is, of course, not a 
particular advantage since most often the source redshift is known. However, a corollary of 
the theorem is that if there are additional mass sheets with shear between the lens and the 
source, say due to other clusters, the formula for H is transparent to their presence provided 
the additional clusters are describable by quadratic potentials over the angular extent of the 
lensed images. This theorem is quite useful. There is evidence for a second cluster at redshift 
0.51 in the field of 0957+561 (|BTK| , |Angonin-Willaime et al. 1994Q , and there may well be 



other clusters at higher redshift. It is advantageous to be able to estimate Ho independently 
of these complications. Another interesting result is that with a measurement of o one 
directly obtains the angular diameter distance Dd to the lens regardless of cosmological 
model, i.e. the result is independent of the values of q and A ( [Narayan 199 1| ) . 

The k degeneracy can also be eliminated by estimating the mass surface density of 
the cluster directly. One simple method is to measure the core radius of the cluster, its 
velocity dispersion, and the location of the lens relative to the cluster center. We describe 
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in §6^2 how this information can be translated into an estimate of k. Using the parameter 
values given by Rhee et al. (1995), we estimate that k ~ 0.1 — 0.2 for the lensing cluster 
in 0957+561. This translates to H Q = (47 — 58) kms" 1 Mpc _1 for Atba — l-5yr and 
Ho = (64 — 80) kins -1 Mpc -1 for Atba = 11 yr. A more ambitious undertaking is to map 
the two-dimensional surface density of the cluster using the weak distortions of background 
galaxies. The idea for this method goes back to Tyson, Wenk, & Valdes (1990) and Kaiser 
& Squires (1993). Rhee et al. (1995) are currently applying the method to the field around 
0957+561 and results are awaited. 

What is the future for lens-based measurements of the Hubble constant? We believe 
0957+561 will deliver a result soon. It is only a matter of time before the controversy over 
the time delay is settled. Our model (cf. eq. ||4-14| ) will then directly provide an upper bound 



on Hq or a direct estimate if we take the value of n ~ 0.1 — 0.2 mentioned above. Measuring 
a and thereby obtaining a more reliable estimate of n is more challenging, but efforts are 
under way and once again we are optimistic. With measurements of both Atba and a, 
our model should provide an estimate of Ho which would be quite competitive with other 
determinations. There is unfortunately one major remaining uncertainty, namely whether 
our model captures the mass distribution of the lens sufficiently well. The poor reduced x 2 of 
our model is certainly a concern. Perhaps a mass-traces-light model will improve matters, or 
perhaps future observations will resolve our SPLS discrepancies. In any case, it is desirable 
to study additional lenses in order to obtain other independent estimates of H . 

It is generally agreed that 0957+561 is not the best source for estimating Ho since the 
presence of the cluster adds an extra layer of complication. Several other good candidates 
are available where the lensing galaxy appears to be more or less isolated. Such systems are 
easier to model. The radio ring sources are particularly promising since the modeling of these 
is likely to be more reliable than with multiply-imaged optical quasars. The experience with 
0957+561 shows that having information on the full relative magnification matrix and its 
spatial derivative is invaluable for constraining the mass model. Therefore, resolved sources 
which can be mapped and modeled in detail are likely to be much superior to unresolved 
sources. 

To conclude, we emphasize that the lens-based method of estimating Ho is completely 
independent of all other methods and works directly on high redshift sources without using 
any intermediate distance ladders. Furthermore, the method is based on very basic geometry 
and physics. These are substantial advantages, and we feel that the method deserves to be 
pursued seriously. 
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A. Approximating a Softened Power-Law Homoeoidal Lens 

A.l. Lensing Properties of the SPLH 

As we often observe galaxies to have elliptical isophotes, it would be appealing to model 
their lensing properties with a mass distribution having elliptical isodensity contours. A 
simple example is the homoeoid, whose isodensity contours are concentric ellipses of constant 
ellipticity and position angle. The surface density of the homoeoid varies only as a function 
of the elliptical "radius" r em from the center of the distribution: 

r em = JY+^ + (l- e )2 J (M) 

where the parameter e reflects the degree of flattening. In the above equation and through- 
out the remainder of this appendix, we assume that the major axis of the elliptical profile 
is aligned with the x-axis of our coordinate system. From equation [ |A1|| , we relate the 
asphericity parameter e to the axis ratio of isodensity contours: 



a 1 + e 

We emphasize that e is not equivalent to the ellipticity e, defined by e = 1 — (b/a). 

Bourassa & Kantowski (1975) found that the lensing properties of generic elliptical mass 
distributions may be compactly expressed by adopting a complex angular notation, where 
vector angles map to the complex plane according to i— ► z — 9 X + id y . They derived the 
complex ray deflection a*(z) for a general homoeoidal density profile ft(r em ) = S(r em )/S cr : 



a (z) 



with z as the complex conjugate of z. Here S denotes the branch of the complex square root 
for which S^/z 1 — 4ep 2 and z lie in the same quadrant of the complex plane (Bray 1984). 
As noted by Schramm (1994), closed solutions of equation ||A2|| for general elliptical lenses 
have rarely been found (Narasimha 1982, KK| ) and are by construction restricted to density 



profiles with homoeoidal symmetry. 

I I _ 

KK have shown that equation JA2J has a closed form for softened isothermal homoeoids, 
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whose density profiles are characterized by central convergence k and core radius 6 C : 

K 9 c 



H + 9 2 

em 1 w c 



(A3) 



The complex ray deflection for the softened isothermal homoeoid is 



a*(z) 
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where x = Re(z), y = Im(z), and «e = 26* c k is the asymptotic Einstein radius for 6 C = e 

We have found that a closed form of the ray deflection equation ||A2|| also exists for the 
singular power-law homoeoids. The density profile of the singular power-law homoeoid is 
parameterized by asymptotic (e — ► 0) Einstein radius cue and power-law index r\ of radial 
mass increase: 

(A5) 
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Substituting the surface density from equation ||A5|| into the integrand of equation A~2 
obtain the complex ray deflection of the singular power-law homoeoid: 



we 
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where \z\ = V zz and 2 -Pi is the complex hypergeometric function. In the limit of azimuthal 
symmetry (e = 0), r em =\z\ and the hypergeometric function becomes unity. Equation |[A6|| 
therefore reduces to a*(z) = (\z\/aE) v ~ 2 z, which may be compared with the deflection of a 



singular power-law sphere (eq. 



with 6 = 0). 



A. 2. Tilted Plummer Elliptical Potentials 

To overcome the difficulties of elliptical mass distributions, we turn to elliptical poten- 
tials. Such potentials vary only as a function of the elliptical radius r 2 p = [x 2 {l — e p ) + 
y 2 (l + e p )], characterized by asphericity parameter e p . Although the lensing properties of 
elliptical potentials are easily expressed in closed form, their associated isodensity contours 
are sometimes unphysical. Elliptical potentials with more than a moderately high aspheric- 
ity ( e P ^ 0.2) require isodensity contours that are dumbbell-shaped (|KK] ). In certain cases, 
highly flattened elliptical potentials may even require negative isodensity contours ([Blandford 
|fc Kochanek 19871 ). However, elliptical potentials provide quite an accurate representation of 
elliptical mass distributions for ellipticities e ^ 0.3. Fortunately for this study, the 0957+561 
lensing galaxy Gl has an isophotal ellipticity compatible with this limit. 
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Isothermal (rj = 1) elliptical potentials have been popular in the modeling of aspherical 
galaxies (e.g. Blandford & Kochanek 1987; Kochanek & Blandford 1987; |Wallington fc| 



INarayan 1993| ), but we seek an elliptical potential which has the radial power-law generality 
of the SPLH. The family of tilted Plummer potentials fits this description: 



or 
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(A7) 



where the potential expressed in units of radian 2 , is equivalent to ip/c 2 in the notation of 
§|2|. Here uo p represents the core radius of the potential, and we recycle the SPLS parameters 
«e and rj to represent an effective Einstein radius and radial power-law index, respectively. 
Taking the gradient of equation [A7|, we find that the ray deflection of the tilted Plummer 
model is given by 

~ m-e p )x\ 



or 



(A* 



Comparing the asymptotic (u p , e p , 9 C — > 0) behavior of equation ||A8|| with the SPLS analogue 
(eq. |[4-6|1 ), we see that the tilted Plummer model parameters oie and rj and their SPLS 
counterparts are equivalently defined. 

In order to relate the core radius parameters u v and 9 C , we follow the convention of 



KK| , who require the equivalence of the lens central convergence kq. The Poisson equation 

yields the convergence for the tilted Plummer model: 



in conjunction with equation 

f ul + x 2 (l - e p ) [( V /2)(l - e p ) + e p ] + y 2 (l + e p ) [( V /2)(l + e p ) 
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We obtain the convergence for the SPLH model by adding a finite core radius to the singular 
power-law homoeoid (eq. |[A3|| ) : 




(A10) 



With somewhat more effort, it may also be shown that equation [ |A1(J| ] is the elliptical gen- 
eralization (r — > r em ) of the SPLS convergence obtained from equation |[4-3|| divided by the 
critical density S cr . Equating the central convergences of the SPLH (eq. ||A10|| ) and tilted 
Plummer potential (eq. |[A9|| ) gives the desired relation between the respective core radii: 




(All) 
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Finally, we must determine the appropriate value for the tilted Plummer potential as- 
phericity e p . We seek the e p for which the axis ratio of isodensity contours at large radius, 
(6/0)00, is equal to the observed isophotal axis ratio (b/a) = 1 — e. Making use of equation 



in the limit uj v 



0, we obtain 



1/2 r 



l + e p J Ll + e p (2/»7-l) 



1 

2- n 



(A12) 



As an aside, we note that equation [ A12 | with 77 = 
noted by |KK| : 

fb\ (l-tr 



1 simplifies to the isothermal relation 



3/2 



l + e t 



With equations ||A11|| and [|A12|| , we may now determine the set of tilted Plummer potential 
parameters (a> p , 77, oe, and e p ) which approximate an arbitrary softened power-law homoeoid 
specified by radial parameters (9 C , rj, and qe) and isodensity ellipticity e. 
I I 

KK show that the tilted Plummer potential and SPLH potential approximately coincide 
when the asphericities and core sizes are sufficiently small. In particular, the sizes of the 
respective tangential caustics coincide. The Gl core sizes compatible with nondetection of a 
third QSO image (6 C <^ O'.'l) are indeed small, and even at a Gl mass ellipticity of 0.30 there is 
only minor deviation between the models' isodensity contours (cf. [KK| Fig. 4). We therefore 
conclude that our treatment of Gl as an SPLH is not compromised by approximating the 
lensing properties with a suitably chosen tilted Plummer potential. 



-36- 



Table 1: 0957+561 Image Positions and Flux Densities from VLBI 



Emission 


Flux Density 


Radius 


Pos. Ang. 


Component 


(mJy) 


(mas) 


(°) 


A t 


14.2 ± 0.1 








A, 


10.6 ± 0.2 


48.3 ± 0.1 


19.9 ± 0.1 


Bt 


11.4 ± 0.1 








B 5 


7.0 ± 0.5 


58.8 ± 0.1 


17.8 ± 0.1 



Note. - Shown here are the two brightest pairs of emission regions identified in the six-component 



Gaussian model fitted by G94 to 0957+561A,B. The radius and position angles are given as offsets from the 
respective QSO central regions Ai and B\. 

Table 2: 0957+561 Image Magnification Constraints 
A. Relative Magnification Matrix Elements 



Quantity Measured Value 



Mi 




1.23 ± 0.04 


M 2 




-0.50 ± 0.03 


0i ( 


°) 


18.6 ± 0.1 


02 ( 


°) 


118 ± 6 


Mi 


(lO^mas- 1 ) 


0.5 ± 1.5 


M 2 


(lO^Hias^ 1 ) 


2.6 ± 0.8 



B. Correlation Coefficients 



Covar 


M x 


M 2 


01 


02 


M x 


M 2 


M x 


1.00 












M 2 


0.46 


1.00 










01 


-0.39 


-0.38 


1.00 








02 


-0.79 


-0.61 


0.21 


1.00 






M x 


0.96 


0.40 


-0.22 


-0.73 


1.00 




M 2 


0.70 


0.79 


-0.26 


-0.70 


0.70 


1.00 



Note. — Magnification matrix information as measured by G94 for 0957+56lA,B. Here Mx,2 are the 
matrix eigenvalues from A§ to 4>i.2 are the eigenvector position angles. Mi t 2 are spatial derivatives, 
taken upward along the A jet, of the eigenvalues of the relative magnification matrix. Also given is the 
normalized error covariance matrix for these values. 



Table 3: Offset of lensing galaxy center of brightness from 0957+56 IB 



Observation Designation x offset y offset 

Optical* Gl 0"19 ± 0"03 l'.'OO ± 0'.'03 

VLA b G 0'.'151 ± O'.'OOl l'.'OSl ± O'.'OOl 

VLBF G' 0'.'181 ± O'.'OOl l'/029 ± O'.'OOl 



"Stockton 1980; Seeing conditions better than 0'.'5 
Roberts et al. 19851; A = 6 cm 



c Gorcnstcin et al. 1988a; A = 13 cm 



Table 4: Best-Model Estimations and Goodness-of-Fit 

. 1T1 (Obsvd. - Estd.) 

Observable Model Estimate 



Obsv. Err. 



Image Separations 3 ": Contributed \ — 4-5 



A l -B 1 


(-1'.'25271, 6'.'04662) 


(~10" 4 , ~ 


io- 4 ) 


Gl - B x 


(0'.'215, l'/057) 


(0.84, 


1.91) 


A 5 -A x 


(16.4, 45.4) mas 


(0.013, 0.026) 


B 5 — Bi 


(18.0, 56.0) mas 


(0.002, - 


-0.26) 


Magnifications and Gradients: Contributed \ 2 - 


= 21.5 


M x 


1.244 




-0.34 


M 2 


-0.529 




0.95 




18?69 




-0.93 




108° 




1.66 


M x 


(0.97 x lO-^mas" 1 




-0.27 


M 2 


(0.99 x lO-^mas" 1 




1.8 


\Wcb\\ 


<io- 10 








a Expressed in the Cartesian coordinate notation (x, y) described in §|| 
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Table 5: Fitted SPLS Parameters for the 0957+561 Lens System 



Symbol 


Best-Fit 95% Conf. Limits a 


Leasing Galaxy (Gl) 


a E 


2"587 2"560 < a E < 2"690 




O'.'OO 0" <9 C < 0'.'105 


V 


1.165 1.055 < rj < 1.176 


External Shear 


i 


0.224 0.220 < i < 0.237 




-64?40 -65?13 <<p< -63?31 



a Non-Gaussian, see |5.1.1 for details 



Table 6: Fitted FGS Model Parameters Compared with Previous Estimates 



Paramter 


Best-fit 


95% Conf. Limits 


Previous Best-fit a 




1'.'56 


C/!99 <6 C < 2'.'05 


2'.'9±0 / .'l 


a v (km s _1 ) 


340.5 


336 <a v < 355 


390±4 


M bh (1O 9 M ) 


111 


80 < M bh < H4 


115±1 


i 


0.273 


0.266 < i < 0.278 


0.18±0.01 




-64?9 


-67?4 < <p < -62? 1 


-63?3±0?6 



a From FGS Table 3; variations are due to differing lensing notation conventions. 



-39- 



Table 7: Results for Models with Gl Compact Nucleus 





Lowest x 2 


Lowest x 2 


"Isothermal" 


M bh = M b F h GS 


Symbol 


SPLS 


FGS 


SPLS 


SPLS 


M bh (1O 9 M ) 


27.2 


110.9 


78.8 


110.9 (fixed) 


o c 


O'.'OOO 


1"56 


0'.'714 


l'/33 


V 


1.38 


N/A 


1 (fixed) 


0.256 


«E 


2"44 


N/A 


3'.' 10 


6'.'34 


1 


0.194 


0.273 


0.238 


0.278 




-65?43 


-64?86 


-65?03 


-64? 74 


X 2 /d.f. a 


5.5 


5.7 


4.9 


5.6 


^1.5 


0.502 


0.732 


0.627 


0.745 


The first model has four degrees 


of freedom; the 


rest have five. 





Table 8: Results for Models with Gl Ellipticity and Perturbed Cluster 





e = 0.30 


Perturbed Cluster 


Symbol 


SPLH a 


SPLS b 


e c 


O'.'OOO 


O'.'OOO 


V 


1.157 


1.159 




2'.'50 


2'.'33 


i 


0.224 


0.205 




-76?95 


-61?05 


X 2 /d.f. c 


3.8 


3.4 




0.615 


0.582 



a Described in §5.4. 
6 Described in § |5.5| . 

c Both models have six degrees of freedom. 
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Figure Captions 

Fig. 1. — Contours of x 2 for the best-fit SPLS models as a function of the mass power- law 
exponent rj and core radius 9 C . Although the overall best model has zero core radius, there 
is a "valley" of low Ax 2 extending to (9 C ~ 110 mas. The valley is truncated at larger core 
radii because these models do not sufficiently demagnify the (unseen) third QSO image. 

Fig. 2. — Similar to Figure [I], except that contours of hi.5 (solid lines) are overlaid on the 
contours of lowest x 2 (dotted lines) for fixed values of the mass power-law exponent r] and 
core radius 9 C . 

Fig. 3. — Curve showing the lowest x 2 (abscissa) obtainable for SPLS models producing a 
given (ordinate). The shallow and then steepening rise to the right of the minimum is 
caused by our unorthodox penalty assignment for third image flux (§ |3.2| ). As can be seen 
in Figure 0, the models giving larger h\£ values correspond to increasing core radius, which 
are chiefly penalized because they allow increasing flux for the (unseen) third QSO image. 

Fig. 4. — Contours of model-predicted Gl stellar velocity dispersion o (in km s _1 , solid 
lines) are overlaid on the contours of lowest x 2 (dotted lines) for fixed values of the mass 
power-law exponent rj and core radius 8 C . We assume isotropic stellar orbits and a 1" slit for 
the measurement. 

Fig. 5. — Curves of x 2 versus model-predicted Gl stellar velocity dispersion (v 2 ^) 1 ^ 2 = a for 
orbit anisotropics q = —0.2 (dotted), (solid), and 0.2 (dashed). The three cases represent 
stellar orbits which are slightly tangential, isotropic, and slightly radial, respectively. 
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